add action input as parameters for tool execution in conversational agent #3200

Merged
merged 3 commits into from
Dec 31, 2024

Conversation

jngz-es
Collaborator

@jngz-es jngz-es commented Nov 4, 2024

Description

Related Issues

#3134

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

mingshl
mingshl previously approved these changes Nov 14, 2024
@mingshl
Collaborator

mingshl commented Nov 14, 2024

Tests passed; the failure is in the artifact upload step and is not related to this code change. Approved.

Run actions/upload-artifact@v4
/usr/bin/docker exec  a6c6ef6ad38d2e9993d03ba2bdc50f2146f892a4a109fc9fecaf2c66802943f0 sh -c "cat /etc/*release | grep ^ID"
/__e/node20/bin/node: /lib64/libm.so.6: version `GLIBC_2.27' not found (required by /__e/node20/bin/node)
/__e/node20/bin/node: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by /__e/node20/bin/node)

@reuschling

reuschling commented Nov 18, 2024

The changes for AgentUtils look fine, but is AgentUtils used for conversational agents? The tool parameters are built inside MLConversationalFlowAgentRunner.getToolExecuteParams(MLToolSpec toolSpec, Map<String, String> params), i.e. here. There is no AgentUtils invocation. The only invocation of AgentUtils.constructToolParams I found is inside MLChatAgentRunner.

I showed a code proposal for getToolExecuteParams at #2977 (comment). The only difference is that there is no dedicated actionInput parameter there; so far the actionInput is the "input" entry, which has to be stored temporarily in a local variable.

I would also highly recommend adding the new "action_input" parameter to flow agents (i.e. MLFlowAgentRunner.getToolExecuteParams) as well. AgentUtils is not used there either. Of course there is the option of using parameters.previous_tool_name.output, but tool specifications should behave the same regardless of where they are used, whether inside flow or conversational agents.

@jngz-es
Collaborator Author

jngz-es commented Dec 2, 2024

Hi @reuschling, thanks for the comments. Yeah, you are right. The change is only for conversational agents. As you also mentioned, parameters.previous_tool_name.output is designed as the action input for flow agents, which are a sequence of tools. I don't see a flow agent use case where parameters.previous_tool_name.output could not meet the requirement but the action input could.

@jngz-es jngz-es temporarily deployed to ml-commons-cicd-env December 2, 2024 18:57 — with GitHub Actions Inactive
@jngz-es jngz-es had a problem deploying to ml-commons-cicd-env December 2, 2024 21:23 — with GitHub Actions Failure
@jngz-es jngz-es temporarily deployed to ml-commons-cicd-env December 2, 2024 22:24 — with GitHub Actions Inactive
@jngz-es jngz-es temporarily deployed to ml-commons-cicd-env December 2, 2024 23:21 — with GitHub Actions Inactive
@reuschling

Hi @reuschling, thanks for the comments. Yeah, you are right. The change is only for conversational agents.

Sorry, but I think you misunderstood me. Currently this PR makes NO change to conversational agents. It changes AgentUtils which is ONLY invoked inside MLChatAgentRunner. For conversational agents, you have to modify MLConversationalFlowAgentRunner.

As you also mentioned, parameters.previous_tool_name.output is designed as the action input for flow agents, which are a sequence of tools. I don't see a flow agent use case where parameters.previous_tool_name.output could not meet the requirement but the action input could.

Exactly. But when someone wants to use the same tool inside a conversational agent and inside a flow agent, they have to change the tool definition from parameters.previous_tool_name.output to action input or vice versa. Why give the same thing different names? This is not about more functionality, but about logic and design.

@ylwu-amzn
Collaborator

ylwu-amzn commented Dec 4, 2024

@reuschling Thanks for reviewing this PR. From #2918 (comment), I think you know "MLChatAgentRunner uses the AgentUtils method AgentUtils.constructToolParams for generating the params for a tool."

The name seems confusing, but MLChatAgentRunner is actually for conversational agents (code link).
Refer to this tutorial for the differences between the agent types.

Test

I have tested this PR with Bedrock anthropic.claude-instant-v1 model.

  1. First create a test_population_data index that has a text field population_description. Refer to this tutorial for creating this index.
  2. Configure the config of SearchIndexTool: add a static input template with the placeholder llm_generated_action_input. The placeholder will be substituted with the input generated by the LLM.
POST _plugins/_ml/agents/_register
{
    "name": "Test Agent",
    "type": "conversational",
    "description": "Simple agent to test the agent framework",
    "llm": {
        "model_id": "<your LLM model id>",
        "parameters": {
            "max_iteration": 5,
            "stop_when_no_tool_found": true,
            "disable_trace": false
        }
    },
    "memory": {
        "type": "conversation_index"
    },
    "app_type": "chat_with_rag",
    "tools": [
        {
            "type": "SearchIndexTool",
            "description": "A tool to search opensearch index with natural language question. If you don't know answer for some question, you should always try to search data with this tool. Action Input: <natural language question>",
            "include_output_in_agent_response": true,
            "config": {
                "input": "{\"index\": \"test_population_data\", \"query\": {\"query\":{\"match\":{\"population_description\":\"${parameters.llm_generated_action_input}\"}}} }"
            }
        }
    ]
}

Test the agent with:

{
  "parameters": {
    "question": "what's the population increase of Seattle from 2021 to 2023?"
  }
}
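
The placeholder substitution described in step 2 can be sketched as follows. This is a minimal illustration, not the ml-commons implementation: the class PlaceholderSketch and the method fill are hypothetical names, and the regex-based replacement is an assumption about how a `${parameters.<name>}` template could be filled.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ml-commons.
class PlaceholderSketch {
    // Matches ${parameters.<name>} and captures <name>.
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\$\\{parameters\\.([A-Za-z0-9_]+)\\}");

    // Replaces every known placeholder in the template with its value
    // from params; unknown placeholders are left untouched.
    static String fill(String template, Map<String, String> params) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = params.get(matcher.group(1));
            String replacement = value != null ? value : matcher.group();
            // quoteReplacement keeps '$' and '\' in the value literal.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Note that this sketch inserts the value as-is. An LLM-generated input containing a double quote would break the surrounding JSON template, so a real implementation would likely need to escape the value first.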

Feel free to test. BTW, you can find @jngz-es and me in the public OpenSearch ML Slack channel (https://join.slack.com/t/opensearch/shared_invite/zt-2r5scz3ty-5SMPhqJE_Lk2HqC6ex4mWg); you are welcome to join. I'm happy to jump on a call if it's easier to explain the details with a demo.

@@ -472,6 +472,11 @@ public static Map<String, String> constructToolParams(
if (toolSpecConfigMap != null) {
toolParams.putAll(toolSpecConfigMap);
}
toolParams.put("llm_generated_action_input", actionInput);
Collaborator

Maybe there is no need to mention action explicitly, considering that the REST API uses tool. Users may be confused by tool versus action. How about just llm_generated_input?

Contributor

Can also consider using a constant for this string, since it is reused in the tests.
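
The two suggestions in this review thread can be illustrated with a small sketch. It assumes a simplified stand-in for the hunk above: the class ToolParamsSketch, the method buildToolParams, and the constant name LLM_GENERATED_ACTION_INPUT are hypothetical, and only the key string and the put/putAll order come from the diff. The commit list on this PR records that the key was later renamed to llm_generated_input.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the hunk under review.
class ToolParamsSketch {
    // Single definition of the key, so production code and tests
    // refer to the same string (the constant name is an assumption).
    static final String LLM_GENERATED_ACTION_INPUT = "llm_generated_action_input";

    // Copies the tool spec config (if any) and then adds the
    // LLM-generated action input under the shared key.
    static Map<String, String> buildToolParams(Map<String, String> toolSpecConfigMap, String actionInput) {
        Map<String, String> toolParams = new HashMap<>();
        if (toolSpecConfigMap != null) {
            toolParams.putAll(toolSpecConfigMap);
        }
        // HashMap accepts null values, so a null action input is stored as-is.
        toolParams.put(LLM_GENERATED_ACTION_INPUT, actionInput);
        return toolParams;
    }
}
```

Because the action input is put after the config entries, a config entry that happens to use the same key would be overwritten in this sketch.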

@reuschling

@reuschling Thanks for reviewing this PR. From #2918 (comment), I think you know "MLChatAgentRunner uses the AgentUtils method AgentUtils.constructToolParams for generating the params for a tool."

The name seems confusing, but MLChatAgentRunner is actually for conversational agents (code link).

Thanks a lot @ylwu-amzn for the clarification with the code link. Yes, the name confused me, sorry about that @jngz-es. In this case I'm also fine with the changes :)

@jngz-es jngz-es temporarily deployed to ml-commons-cicd-env December 30, 2024 22:22 — with GitHub Actions Inactive
@jngz-es jngz-es had a problem deploying to ml-commons-cicd-env December 30, 2024 22:22 — with GitHub Actions Failure
@jngz-es jngz-es had a problem deploying to ml-commons-cicd-env December 30, 2024 23:23 — with GitHub Actions Failure
@jngz-es jngz-es requested a review from ylwu-amzn December 30, 2024 23:24
@jngz-es jngz-es temporarily deployed to ml-commons-cicd-env December 31, 2024 00:49 — with GitHub Actions Inactive
@jngz-es jngz-es temporarily deployed to ml-commons-cicd-env December 31, 2024 01:45 — with GitHub Actions Inactive
@jngz-es jngz-es merged commit c850eef into opensearch-project:main Dec 31, 2024
9 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Dec 31, 2024
…gent (#3200)

* add llm generated action input as parameters for tool execution in conversational agent

Signed-off-by: Jing Zhang <[email protected]>

* add UT for null action input

Signed-off-by: Jing Zhang <[email protected]>

* change llm_generated_action_input to llm_generated_input

Signed-off-by: Jing Zhang <[email protected]>

---------

Signed-off-by: Jing Zhang <[email protected]>
(cherry picked from commit c850eef)
jngz-es added a commit that referenced this pull request Dec 31, 2024
…gent (#3200) (#3314)

* add llm generated action input as parameters for tool execution in conversational agent

Signed-off-by: Jing Zhang <[email protected]>

* add UT for null action input

Signed-off-by: Jing Zhang <[email protected]>

* change llm_generated_action_input to llm_generated_input

Signed-off-by: Jing Zhang <[email protected]>

---------

Signed-off-by: Jing Zhang <[email protected]>
(cherry picked from commit c850eef)

Co-authored-by: Jing Zhang <[email protected]>
7 participants